Skip to content

implement dpnp.piecewise #2550

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open

implement dpnp.piecewise #2550

wants to merge 6 commits into from

Conversation

vtavana
Copy link
Collaborator

@vtavana vtavana commented Aug 3, 2025

In this PR, dpnp.piecewise is implemented.

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • Have you added documentation for your changes, if necessary?
  • Have you added your changes to the changelog?

@vtavana vtavana self-assigned this Aug 3, 2025
Copy link
Contributor

github-actions bot commented Aug 3, 2025

View rendered docs @ https://intelpython.github.io/dpnp/pull/2550/index.html

Copy link
Contributor

github-actions bot commented Aug 3, 2025

Array API standard conformance tests for dpnp=0.19.0dev3=py313h509198e_10 ran successfully.
Passed: 1227
Failed: 0
Skipped: 9

@vtavana vtavana added this to the 0.19.0 release milestone Aug 4, 2025
@vtavana vtavana marked this pull request as ready for review August 4, 2025 18:47
@coveralls
Copy link
Collaborator

coveralls commented Aug 5, 2025

Coverage Status

coverage: 72.101% (-0.007%) from 72.108%
when pulling fd58e42 on impl-piecewise
into 77a90bf on master.

if isinstance(func, dpnp.ndarray):
func = func.astype(x.dtype)

# TODO: possibly can use func.item() to make sure that func is always
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to use dpnp.where here instead? Or are there any limitations or performance concerns against that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using dpnp.where instead of kernel is ~2x slower

import numpy, dpnp
a = numpy.linspace(-5, 5, 4096*4096)
%timeit numpy.piecewise(a, [a < 0, a >= 0], [-1, 1])
# 91.4 ms ± 148 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
ia = dpnp.array(a)
%timeit res = dpnp.piecewise(ia, [ia < 0, ia >= 0], [-1, 1]); res.sycl_queue.wait()
# 1.64 ms ± 4 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
# 3.06 ms ± 1.17 ms per loop (mean ± std. dev. of 7 runs, 100 loops each) # using dpnp.where

But it keeps the code simple and supports broadcasting
To improve performance, you can try to avoid dpnp.zeros_like() and dpnp.full() in condlen == funclen and condlen + 1 == funclen cases.

Instead, create one def_value, allocate result once dpnp.empty_like(...), and update it in-place with dpnp.where(... out=result) in the loop using any flags to detect the first iteration
First call - dpnp.where(condition, func, def_value, out=result) to fill non-matching elements with def_value
Other calls - dpnp.where(condition, func, result, out=result) to keep existing values and overwrite only where condition is True.

Then I get on PVC for float32

# 5.97 ms ± 109 μs per loop (mean ± std. dev. of 7 runs, 100 loops each) # kernel
# vs
# 6.47 ms ± 8.89 μs per loop (mean ± std. dev. of 7 runs, 100 loops each) # optimized where 

if isinstance(condlist, dpnp.ndarray) and condlist.ndim in [0, 1]:
condlist = [condlist]
elif dpnp.isscalar(condlist) or (
dpnp.isscalar(condlist[0]) and x.ndim != 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no check that condlist is exactly a list. Do we need that or it is assuming to accept any sequence?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice, condlist can be a list, tuple or ndarray in both NumPy and CuPy. And dpnp follows the same practice.

import numpy
a = numpy.linspace(-2.5, 2.5, 6)
numpy.piecewise(a, [a < 0, a >= 0], [-1, 1])  # condlist is list
# array([-1., -1., -1.,  1.,  1.,  1.])

numpy.piecewise(a, (a < 0, a >= 0), [-1, 1])  # condlist is tuple
# array([-1., -1., -1.,  1.,  1.,  1.])

numpy.piecewise(a, numpy.array([a < 0, a >= 0]), [-1, 1])  # condlist is array
# array([-1., -1., -1.,  1.,  1.,  1.])

import cupy
x = cupy.array(a)

cupy.piecewise(x, [x < 0, x >= 0], [-1, 1])  # condlist is list
# array([-1., -1., -1.,  1.,  1.,  1.])

cupy.piecewise(x, (x < 0, x >= 0), [-1, 1])  # condlist is tuple
# array([-1., -1., -1.,  1.,  1.,  1.])

cupy.piecewise(x, cupy.array([x < 0, x >= 0]), [-1, 1])  # condlist is array
# array([-1., -1., -1.,  1.,  1.,  1.])

import dpnp
ia = dpnp.array(a)
dpnp.piecewise(ia, [ia < 0, ia >= 0], [-1, 1])  # condlist is list
# array([-1., -1., -1.,  1.,  1.,  1.])

dpnp.piecewise(ia, (ia < 0, ia >= 0), [-1, 1])  # condlist is tuple
# array([-1., -1., -1.,  1.,  1.,  1.])

dpnp.piecewise(ia, dpnp.array([ia < 0, ia >= 0]), [-1, 1])  # condlist is array
# array([-1., -1., -1.,  1.,  1.,  1.])

"Execution queue is not compatible with allocation queue.");
}

const bool is_result_c_contig = result.is_c_contiguous();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think condition should also be C-contig to keep the kernel running correctly

In [1]: import dpnp

In [2]: import numpy

In [3]: a = numpy.arange(8).reshape(2,2,2)

In [4]: cond = numpy.array(a>3, order="F")

In [5]: a_dp = dpnp.array(a)

In [6]: cond_dp = dpnp.array(cond, order="F")

In [7]: numpy.piecewise(a, [cond], [1])
Out[7]: 
array([[[0, 0],
        [0, 0]],

       [[1, 1],
        [1, 1]]])

In [8]: dpnp.piecewise(a_dp,[cond_dp],[1])
Out[8]: 
array([[[0, 1],
        [0, 1]],

       [[0, 1],
        [0, 1]]])

"Condition and result arrays must have the same dimension.");
}

if (!dpctl::utils::queues_are_compatible(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use check_queue from common validation utils?

}

const bool is_result_c_contig = result.is_c_contiguous();
if (!is_result_c_contig) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same
check_c_contig

if (!is_result_c_contig) {
throw py::value_error("The result array is not c-contiguous.");
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check for no memory overlap?

if isinstance(condlist, dpnp.ndarray) and condlist.ndim in [0, 1]:
condlist = [condlist]
elif dpnp.isscalar(condlist) or (
dpnp.isscalar(condlist[0]) and x.ndim != 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to raise an error like ValueError if condlist is empty?

dpnp.piecewise(x, [], [1])
# output
IndexError: list index out of range

f"with {condlen} condition(s), either {condlen} or {condlen + 1} "
"functions are expected"
)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please handle this case as well

x = dpnp.ones(3)
dpnp.piecewise(x, [x>0], dpnp.array([[1,2,3]]))
# output 
RuntimeError: Unable to cast Python instance of type <class 'dpnp.dpnp_array.dpnp_array'> to C++ type '?' (#define PYBIND11_DETAILED_ERROR_MESSAGES or compile in debug mode for details)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants